LASAGNE: Locality And Structure Aware Graph Node Embedding

نویسندگان

  • Evgheniy Faerman
  • Felix Borutta
  • Kimon Fountoulakis
  • Michael W. Mahoney
چکیده

Recent work has attempted to identify structure in social and information graphs by using the following approach: first, use random walk methods to explore the neighborhood of a node; second, use ideas from natural language processing to use this neighborhood information to learn vector representations of these nodes reflecting properties of the graph. Informally, the idea is that if a node is a member of a meaningful cluster or community, then the vector representation should be higher-quality, thereby leading to improved learning. In this paper, we identify and remedy an important shortcoming of this approach. In particular, we show that the performance of existing methodologies depends strongly on the structural properties of the graph, e.g., the size of the graph, whether the graph has a flat or upward-sloping Network Community Profile (NCP), whether the graph is expander-like, whether the classes of interest are more k-core-like or more peripheral, etc. For larger graphs with flat NCPs that are strongly expander-like, existing methods lead to random walks that expand rapidly, touching many dissimilar nodes, thereby leading to lower-quality vector representations that are less useful for downstream tasks. Based on our findings, we propose Lasagne, a methodology to learn locality and structure aware graph node embeddings in an unsupervised way. Rather than relying on global random walks or neighbors within fixed hop distances, Lasagne exploits strongly local Approximate Personalized PageRank stationary distributions to more precisely engineer local information into node embeddings. This leads, in particular, to more meaningful and more useful vector representations of nodes in poorly-structured graphs. We show that Lasagne leads to significant improvement in downstream multi-label classification for larger graphs with flat NCPs, that it is comparable for smaller graphs with upward-sloping NCPs, and that is comparable to existing methods for link prediction tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Link Prediction using Network Embedding based on Global Similarity

Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...

متن کامل

Detecting Overlapping Communities in Social Networks using Deep Learning

In network analysis, a community is typically considered of as a group of nodes with a great density of edges among themselves and a low density of edges relative to other network parts. Detecting a community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...

متن کامل

Topology-Aware Parallelism for NUMA Copying Collectors

NUMA-aware parallel algorithms in runtime systems attempt to improve locality by allocating memory from local NUMA nodes. Researchers have suggested that the garbage collector should profile memory access patterns or use object locality heuristics to determine the target NUMA node before moving an object. However, these solutions are costly when applied to every live object in the reference gra...

متن کامل

Gemini: A Computation-Centric Distributed Graph Processing System

Traditionally distributed graph processing systems have largely focused on scalability through the optimizations of inter-node communication and load balance. However, they often deliver unsatisfactory overall processing efficiency compared with shared-memory graph computing frameworks. We analyze the behavior of several graph-parallel systems and find that the added overhead for achieving scal...

متن کامل

DagStream: Locality Aware and Failure Resilient Peer-to-Peer Streaming

Live peer to peer (P2P) media streaming faces many challenges such as peer unreliability and bandwidth heterogeneity. To effectively address these challenges, general “mesh” based P2P streaming architectures have recently been adopted. Mesh-based systems allow peers to aggregate bandwidth from multiple neighbors, and dynamically adapt to changing network conditions and neighbor failures. Howeve...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1710.06520  شماره 

صفحات  -

تاریخ انتشار 2017